
    Semantic Segmentation of Pathological Lung Tissue with Dilated Fully Convolutional Networks

    Full text link
    Early and accurate diagnosis of interstitial lung diseases (ILDs) is crucial for making treatment decisions, but can be challenging even for experienced radiologists. The diagnostic procedure is based on the detection and recognition of the different ILD pathologies in thoracic CT scans, yet their manifestations often appear similar. In this study, we propose the use of a deep, purely convolutional neural network for the semantic segmentation of ILD patterns, as the basic component of a computer-aided diagnosis (CAD) system for ILDs. The proposed CNN, which consists of convolutional layers with dilated filters, takes as input a lung CT image of arbitrary size and outputs the corresponding label map. We trained and tested the network on a dataset of 172 sparsely annotated CT scans, within a cross-validation scheme. The training was performed in an end-to-end and semi-supervised fashion, utilizing both labeled and unlabeled image regions. The experimental results show a significant performance improvement over the state of the art.
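
    As a rough illustration of the kind of architecture the abstract describes, the sketch below builds a small fully convolutional network whose layers use dilated filters, so the receptive field grows without pooling and the output label map keeps the input resolution. The layer count, channel widths, dilation rates and the number of ILD classes are illustrative assumptions, not the authors' exact configuration.

```python
# Illustrative sketch only: a small dilated fully convolutional network that
# maps a CT slice of arbitrary size to a dense label map, in the spirit of the
# abstract. All architectural choices here are assumptions.
import torch
import torch.nn as nn

class DilatedFCN(nn.Module):
    def __init__(self, in_channels=1, num_classes=6):
        super().__init__()
        layers = []
        channels = in_channels
        # Increasing dilation enlarges the receptive field without any pooling,
        # so the predicted label map keeps the resolution of the input slice.
        for out_ch, dilation in [(32, 1), (32, 2), (64, 4), (64, 8)]:
            layers += [
                nn.Conv2d(channels, out_ch, kernel_size=3,
                          padding=dilation, dilation=dilation),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            ]
            channels = out_ch
        # 1x1 convolution maps the features to per-pixel class scores.
        layers.append(nn.Conv2d(channels, num_classes, kernel_size=1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)  # (N, num_classes, H, W) label-map logits

# A CT slice of arbitrary size goes in; a same-sized score map comes out.
logits = DilatedFCN()(torch.randn(1, 1, 256, 256))
print(logits.shape)  # torch.Size([1, 6, 256, 256])
```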

    Text detection in images and videos

    No full text
    The goal of a multimedia text extraction and recognition system is to bridge the gap between the mature, established technology of Optical Character Recognition and the new needs for textual information retrieval created by the spread of digital multimedia. A multimedia text extraction system usually consists of four stages: spatial text detection; temporal text detection and tracking (for videos); text image binarization and segmentation; and character recognition. In the framework of this PhD thesis we dealt with all stages of a multimedia text extraction system, focusing on the design and development of techniques for the spatial detection of text in images and videos, as well as on methods for evaluating the corresponding results. The contribution of the thesis to the research area of multimedia text extraction lies in the proposal of generic methods for the spatial detection of unconstrained text in images and videos, regardless of their content, quality and resolution. In addition, two methods for the evaluation of the text detection result were proposed that successfully address problems found in the related literature. Each uses different criteria for evaluating the result, while both are based on intuitively sound observations. Finally, a very efficient method was developed for the temporal detection of static text in video, which contributes to better spatial detection while also enhancing the quality of the text image.

    A Hybrid System for Text Detection in Video Frames

    No full text
    This paper proposes a hybrid system for text detection in video frames. The system consists of two main stages. In the first stage, text regions are detected based on the edge map of the image, yielding a high recall rate with minimal computational requirements. Subsequently, a refinement stage uses an SVM classifier trained on features obtained by a new Local Binary Pattern based operator, which substantially reduces false alarms. Experimental results demonstrate the overall performance of the system and prove the discriminating ability of the proposed feature set.
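
    The refinement stage can be pictured as follows: each edge-based candidate region is described by a Local Binary Pattern histogram, and an SVM keeps or rejects it. The sketch below uses the standard uniform LBP from scikit-image as a stand-in for the paper's custom LBP-based operator, with randomly generated placeholder regions instead of real training crops.

```python
# Sketch of LBP-histogram features plus an SVM text/non-text classifier.
# The uniform LBP here is NOT the paper's custom operator, and the data are
# random placeholders standing in for verified text / non-text crops.
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray_region, points=8, radius=1):
    """Uniform-LBP histogram of a grayscale candidate region."""
    lbp = local_binary_pattern(gray_region, points, radius, method="uniform")
    hist, _ = np.histogram(lbp, bins=points + 2, range=(0, points + 2),
                           density=True)
    return hist

rng = np.random.default_rng(0)
train_regions = [rng.integers(0, 256, (32, 64)).astype(np.uint8) for _ in range(40)]
train_labels = rng.integers(0, 2, 40)          # 1 = text, 0 = false alarm
candidates = [rng.integers(0, 256, (32, 64)).astype(np.uint8) for _ in range(5)]

clf = SVC(kernel="rbf").fit(np.stack([lbp_histogram(r) for r in train_regions]),
                            train_labels)

# Keep only the edge-based candidates that the classifier accepts as text.
kept = [r for r in candidates if clf.predict([lbp_histogram(r)])[0] == 1]
print(len(kept), "of", len(candidates), "candidates kept")
```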

    Food Recognition for Dietary Assessment Using Deep Convolutional Neural Networks

    No full text
    Diet management is a key factor for the prevention and treatment of diet-related chronic diseases. Computer vision systems aim to provide automated food intake assessment using meal images. We propose a method for the recognition of already segmented food items in meal images. The method uses a 6-layer deep convolutional neural network to classify food image patches. For each food item, overlapping patches are extracted and classified, and the class receiving the majority of votes is assigned to the item. Experiments on a manually annotated dataset with 573 food items justified the choice of the involved components and proved the effectiveness of the proposed system, yielding an overall accuracy of 84.9%.
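
    The voting scheme lends itself to a compact sketch: overlapping patches are cut from a segmented item, each patch is classified, and the item receives the class with the most votes. The patch size, stride, number of classes and the tiny randomly initialised stand-in network below are assumptions; this is not the 6-layer architecture described in the paper.

```python
# Sketch of patch extraction and majority voting over CNN patch predictions.
# The network below is only a randomly initialised placeholder classifier.
from collections import Counter
import torch
import torch.nn as nn

patch_size, stride, num_classes = 32, 16, 7      # assumed values

cnn = nn.Sequential(                              # stand-in patch classifier
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, num_classes),
)

def classify_item(item_rgb):
    """item_rgb: (3, H, W) tensor holding one already segmented food item."""
    votes = []
    _, h, w = item_rgb.shape
    for y in range(0, h - patch_size + 1, stride):
        for x in range(0, w - patch_size + 1, stride):
            patch = item_rgb[:, y:y + patch_size, x:x + patch_size]
            with torch.no_grad():
                votes.append(int(cnn(patch.unsqueeze(0)).argmax(dim=1)))
    # The item label is the class with the majority of patch votes.
    return Counter(votes).most_common(1)[0][0]

print(classify_item(torch.rand(3, 128, 128)))
```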

    Multiresolution Text Detection in Video Frames

    No full text
    This paper proposes an algorithm for detecting artificial text in video frames using edge information. First, an edge map is created using the Canny edge detector. Then, morphological dilation and opening are used in order to connect the vertical edges and eliminate false alarms. Bounding boxes are determined for every non-zero-valued connected component, constituting the initial candidate text areas. Finally, an edge projection analysis is applied, refining the result and splitting text areas into text lines. The whole algorithm is applied at different resolutions to handle variability in text size. Experimental results prove that the method is highly effective and efficient for artificial text detection.
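
    A minimal OpenCV version of the single-resolution pipeline might look like the sketch below: Canny edges, morphological dilation and opening with a rectangular kernel, connected components, and a crude geometric filter on the resulting bounding boxes. The kernel size, Canny thresholds and filtering rule are assumed values; the paper additionally repeats the procedure at several resolutions and refines the boxes with edge projection analysis.

```python
# Sketch of the edge-based candidate detection step, with assumed parameters.
import cv2
import numpy as np

def detect_text_boxes(gray):
    edges = cv2.Canny(gray, 100, 200)                      # edge map
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 3))
    connected = cv2.dilate(edges, kernel)                  # connect vertical edges
    cleaned = cv2.morphologyEx(connected, cv2.MORPH_OPEN, kernel)  # drop noise
    n, _, stats, _ = cv2.connectedComponentsWithStats(cleaned)
    boxes = []
    for i in range(1, n):                                  # label 0 is background
        x, y, w, h, area = stats[i]
        if w > 2 * h and area > 100:                       # crude text-like filter
            boxes.append((x, y, w, h))
    return boxes

# Toy frame with synthetic "artificial" text.
gray = np.zeros((240, 320), np.uint8)
cv2.putText(gray, "SAMPLE TEXT", (20, 120),
            cv2.FONT_HERSHEY_SIMPLEX, 1.0, 255, 2)
print(detect_text_boxes(gray))
```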

    Dish Detection and Segmentation for Dietary Assessment on Smartphones

    No full text
    Diet-related chronic diseases severely affect personal and global health. However, managing or treating these diseases currently requires long training and high personal involvement to succeed. Computer vision systems could assist with the assessment of diet by detecting and recognizing different foods and their portions in images. We propose novel methods for detecting a dish in an image and segmenting its contents, with and without user interaction. All methods were evaluated on a database of over 1600 manually annotated images. The dish detection scored an average of 99% accuracy with a run time of 0.2 s per image, while the automatic and semi-automatic dish segmentation methods reached average accuracies of 88% and 91%, respectively, with an average run time of 0.5 s per image, outperforming competing solutions.

    Classification of interstitial lung disease patterns using local DCT features and random forest

    No full text
    Over the last decade, a plethora of computer-aided diagnosis (CAD) systems have been proposed aiming to improve the accuracy of physicians in the diagnosis of interstitial lung diseases (ILDs). In this study, we propose a scheme for the classification of HRCT image patches with ILD abnormalities as a basic component towards the quantification of the various ILD patterns in the lung. The feature extraction method relies on local spectral analysis using a DCT-based filter bank. After convolving the image with the filter bank, q-quantiles are computed to describe the distribution of local frequencies that characterize image texture. Then, the gray-level histogram values of the original image are appended, forming the final feature vector. The described patches are then classified by a random forest (RF) classifier. The experimental results demonstrate the superior performance and efficiency of the proposed approach compared with the state of the art.
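
    The feature pipeline can be sketched with standard tools: 2D DCT basis functions serve as the filter bank, q-quantiles summarize each filter response, the gray-level histogram is appended, and a random forest does the classification. The bank size, quantile levels, histogram bins and the random placeholder patches below are assumptions made for illustration.

```python
# Sketch of DCT filter-bank features, quantile statistics and a random forest.
# Parameters and data are illustrative placeholders, not the paper's setup.
import numpy as np
from scipy.fft import idctn
from scipy.signal import convolve2d
from sklearn.ensemble import RandomForestClassifier

def dct_filter_bank(size=4):
    """2D DCT basis functions used as convolution kernels."""
    bank = []
    for u in range(size):
        for v in range(size):
            coeffs = np.zeros((size, size))
            coeffs[u, v] = 1.0
            bank.append(idctn(coeffs, norm="ortho"))
    return bank

QUANTILES = (0.1, 0.25, 0.5, 0.75, 0.9)

def patch_features(patch, bank):
    feats = []
    for kernel in bank:
        response = convolve2d(patch, kernel, mode="valid")
        feats.extend(np.quantile(response, QUANTILES))   # local frequency stats
    hist, _ = np.histogram(patch, bins=16, range=(0, 1), density=True)
    return np.concatenate([feats, hist])                 # texture + intensity

# Placeholder training data standing in for annotated HRCT patches.
rng = np.random.default_rng(0)
bank = dct_filter_bank()
X = np.stack([patch_features(rng.random((32, 32)), bank) for _ in range(40)])
y = rng.integers(0, 5, 40)                               # 5 hypothetical ILD patterns
clf = RandomForestClassifier(n_estimators=100).fit(X, y)
print(clf.predict(X[:3]))
```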

    Smartphone-based Urine Strip Analysis

    No full text
    Point-of-care testing (POCT) has transformed the healthcare landscape by delivering quick and cheap diagnostic services closer to the patient. Urine test strips are one of the most commonly used POCT tools; however, their manual interpretation can be challenging, particularly for the elderly and people with eye disorders. In this study, we propose a smartphone application designed to automatically perform semi-quantitative colorimetric analysis on urine strips using just one image of the strip, placed on a specially designed reference card. Virtually no restrictions apply to how the image is captured, while the system adapts to different smartphones and varying ambient light conditions. For the detection of the card, we match ORB keypoints between the captured image and the reference card in memory, and use RANSAC to calculate the projective transform. Then, the strip's rectangular pads are precisely detected and their CIELUV chromaticity components are compared to the manufacturer's reference values after being corrected using SVM regression. We tested the application on the analysis of pH values, using three different smartphones under various lighting conditions, and obtained promising results that prove the concept. Future work includes extending the system to perform fully quantitative analysis and to support additional analytes and colorimetric strips.
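
    The card-detection step maps onto standard OpenCV calls: ORB keypoints are extracted from the captured image and the stored reference card, matched with a Hamming-distance brute-force matcher, and RANSAC fits the projective transform. The synthetic card drawn below and all thresholds are illustrative assumptions; pad localisation, the CIELUV comparison and the SVM colour correction are not shown.

```python
# Sketch of ORB matching plus RANSAC homography for locating the reference
# card. The synthetic images and parameters are placeholders for illustration.
import cv2
import numpy as np

# Synthetic stand-in for the reference card so the sketch runs without files.
reference = np.full((320, 220), 255, np.uint8)
for i, label in enumerate(["LEU", "NIT", "PRO", "pH", "BLO", "GLU"]):
    y = 40 + 45 * i
    cv2.rectangle(reference, (100, y), (190, y + 30), 60 + 25 * i, -1)
    cv2.putText(reference, label, (15, y + 22), cv2.FONT_HERSHEY_SIMPLEX, 0.7, 0, 2)

# A perspective-distorted "photo" of the same card.
warp = np.float32([[0.9, 0.2, 40], [-0.15, 1.0, 30], [2e-4, 1e-4, 1.0]])
photo = cv2.warpPerspective(reference, warp, (360, 420), borderValue=255)

orb = cv2.ORB_create(nfeatures=1500)
kp_ref, des_ref = orb.detectAndCompute(reference, None)
kp_img, des_img = orb.detectAndCompute(photo, None)

# Brute-force Hamming matching suits ORB's binary descriptors.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = matcher.match(des_ref, des_img)

src = np.float32([kp_ref[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
dst = np.float32([kp_img[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

# RANSAC discards mismatches while fitting the projective transform.
H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

# Warping the photo back into the card's frame makes pad positions predictable.
h, w = reference.shape
rectified = cv2.warpPerspective(photo, np.linalg.inv(H), (w, h))
print("inlier matches:", int(inliers.sum()))
```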